Debugging Tool for Localizing Faulty Processes in Message Passing Programs

Authors

  • Masao Okita
  • Fumihiko Ino
  • Kenichi Hagihara
Abstract

In message passing programs, once a process terminates with an unexpected error, the terminated process can propagate the error to the rest of the processes through communication dependencies, resulting in a program failure. Therefore, to locate faults, developers must identify the group of processes involved in the original error and the faulty processes that activate faults. This paper presents a novel debugging tool, named MPI-PreDebugger (MPI-PD), for localizing faulty processes in message passing programs. MPI-PD automatically distinguishes the original errors from the propagated errors by checking for communication errors during program execution. If MPI-PD observes any communication errors, it backtraces communication dependencies and points out potential faulty processes in a timeline view. We also introduce three case studies in which MPI-PD played a key role in debugging. From these studies, we believe that MPI-PD helps developers locate faults and allows them to concentrate on correcting their programs.
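The abstract does not spell out how the backtracing works, so the following is only a minimal sketch of the general idea and not MPI-PD's actual algorithm or interface. It assumes a hypothetical log of communication errors given as `(observer, peer)` pairs, meaning that process `observer` saw a communication error while communicating with `peer`. The function name `localize_faulty` and the input format are illustrative assumptions.

```python
from collections import defaultdict


def localize_faulty(comm_errors):
    """Backtrace communication errors to candidate faulty processes.

    comm_errors: iterable of (observer, peer) pairs (hypothetical format),
    meaning `observer` saw a communication error while talking to `peer`.
    Returns the processes reached by backtracing that reported no
    communication error of their own, i.e. candidate origins.
    """
    # Map each process to the peers it blames for its communication errors.
    blamed = defaultdict(set)
    for observer, peer in comm_errors:
        blamed[observer].add(peer)

    candidates = set()
    visited = set()
    stack = list(blamed)  # start from every process that observed an error
    while stack:
        proc = stack.pop()
        if proc in visited:
            continue  # also guards against cyclic dependencies
        visited.add(proc)
        peers = blamed.get(proc)
        if not peers:
            # Blamed by others but saw no communication error itself:
            # treat it as a potential origin of the failure.
            candidates.add(proc)
        else:
            stack.extend(peers)
    return candidates


if __name__ == "__main__":
    # Process 1 fails first; 2 depends on 1; 3 and 4 depend on 2.
    print(localize_faulty([(3, 2), (2, 1), (4, 2)]))  # prints {1}
```

In this toy example the errors seen by processes 2, 3, and 4 are treated as propagated, and only process 1 is reported as a candidate. A real tool would also need timing information and the full communication trace, which this sketch omits.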

Similar resources

Debugging Message Passing Programs Using Invisible Message Tags

Source level debuggers for parallel PVM or MPI programs currently offer good support for debugging multiple processes, however, they still lack adequate mechanisms for debugging message passing errors. In this paper, we present a new concept called message breakpoints, which allows to follow the information flow between processes. We also show how these breakpoints can be implemented very efficientl...

MARMOT: An MPI Analysis and Checking Tool

The Message Passing Interface (MPI) is widely used to write parallel programs using message passing. MARMOT is a tool to aid in the development and debugging of MPI programs. This paper presents the situations where incorrect usage of MPI by the application programmer is automatically detected. Examples are the introduction of irreproducibility, deadlocks and incorrect management of resources l...

Towards Visual Development of Message-Passing Programs

Writing and managing programs for parallel systems is a difficult task. It is a great challenge for designers of visual programming languages to provide tools that will help in the process. This paper describes a new graph based tool called Visper that provides a multidimensional environment for program composition. Our approach combines different levels of abstraction at which parallel program...

The Distributed Application Debugger

Developing parallel programs which run on distributed computer clusters introduces additional challenges to those present in traditional sequential programs. Debugging parallel programs requires not only inspecting the sequential code executing on each node but also tracking the flow of messages being passed between them in order to infer where the source of a bug actually lies. This thesis foc...

A Preliminary Topological Debugger for MPI Programs

Most parallel programs use regular topologies to support their computation. Since they define the relationship between processes, process topologies present an excellent opportunity for debugging. The primary benefit is that patterns of expected behaviour can be abstracted and identified, and unexpected behaviour reported. However, topology support is inadequate in many environments, including ...

Journal title:
  • CoRR

Volume: cs.SE/0310015  Issue:

Pages: -

Publication date: 2003